When a Model Is 5× Faster, It’s No Longer the Same Model
The Gemini 3.5 Flash launch barely talked about intelligence; Zhipu’s GLM-5.1 High-Speed edition hit 400 token/s. Both tell the same story—once inference speed crosses that 5× threshold, the model unlocks an entirely different product category.
6 min read
